Scientific IPython Setup

We need Python's plotting library, matplotlib. Your environment might load matplotlib automatically, but for this tutorial I'll load it explicitly using this convention. If you are unfamiliar with matplotlib, do the same as I do here, and everything that follows will work without modification.



In [2]:

    
%matplotlib inline
import matplotlib.pyplot as plt

We also might want to use scientific Python libraries. Finally, we'll import mr itself.



In [1]:

    
import numpy as np
import pandas as pd
from pandas import DataFrame, Series  # for convenience

import mr

Optionally, tweak the plotting styles.



In [29]:

    
plt.rcParams['figure.figsize'] = 10, 6
plt.rcParams['image.cmap'] = 'gray'

Locate Features

Open a video

Load a video file as a frames. This object is an iterator that returns a frame each time as its called. The frame is a numpy.ndarray of brightness values.



In [3]:

    
frames = mr.Video('../mr/tests/water/bulk-water.mov')



In [4]:

    
frames









    Out[4]:





<Frames>
Source File: ../mr/tests/water/bulk-water.mov
Frame Dimensions: 640 x 424
Cursor at Frame 0 of 480

The Video constructor assumes brightfield video, inverting black and white. (See image below.) To process video of fluorescent particles, include the keyword argument invert=False.



In [5]:

    
mr.play(frames) # spawn a window playing the video

Find features in one frame.

Take a look at, say, Frame 5.



In [30]:

    
plt.imshow(frames[5], cmap=plt.cm.gray)









    Out[30]:





<matplotlib.image.AxesImage at 0x1203f1bd0>

Locate bright features with a diameter of about 11 px. The size must be an odd integer, and it is better to err on the large side. We will see why below.



In [32]:

    
f = mr.locate(frames[5], 11)

mr.locate returns a spreadsheet-like object called a DataFrame. It lists

each feature's position,
various characterizations of its appearance, which we will use to filter out spurrious features,
the "signal" strength and an estimate of uncertainty, both derived from this paper

(DataFrames are provided by the pandas module, and you can read more about them in the pandas documentation. They can easily be exported to a variety of formats.)



In [8]:

    
f.head() # shows the first few rows of data









    Out[8]:






  
    
      
      x
      y
      mass
      size
      ecc
      signal
      ep
    
  
  
    
      0
        69.227507
       5.805180
       3755
       2.846401
       0.066815
       21.609637
       0.095450
    
    
      1
       205.860481
       6.449213
       4414
       2.908863
       0.149883
       24.609637
       0.080542
    
    
      2
       336.196881
       6.378277
       2264
       2.133259
       0.082158
       24.609637
       0.146473
    
    
      3
        36.281895
       8.087197
       4262
       2.920783
       0.099597
       22.609637
       0.087016
    
    
      4
       277.321270
       7.803879
       2729
       2.808470
       0.160056
       14.609637
       0.144733
    
  

5 rows × 7 columns



In [33]:

    
mr.annotate(f, frames[5])









    












    Out[33]:





<matplotlib.axes.AxesSubplot at 0x120b66610>

Many of these are spurious; they are fleeting peaks in brightness that aren't actually particles. There are many ways to distinguish real particles from spurrious ones. The most important way is to look at total brightness ("mass").



In [34]:

    
ax = f['mass'].hist()
ax.set_xlabel('mass')
ax.set_ylabel('count')









    Out[34]:





<matplotlib.text.Text at 0x120b4bfd0>



In [11]:

    
f = mr.locate(frames[5], 11, minmass=2000)



In [35]:

    
mr.annotate(f, frames[5])









    












    Out[35]:





<matplotlib.axes.AxesSubplot at 0x120b455d0>

There are more options in mr.locate, which you can review in the docstring using help(mr.locate) or mr.locate?.



In [13]:

    
help(mr.locate)









    



Help on function locate in module mr.feature:

locate(image, diameter, minmass=100.0, maxsize=None, separation=None, noise_size=1, smoothing_size=None, threshold=1, invert=False, percentile=64, topn=None, preprocess=True)
    Locate Gaussian-like blobs of a given approximate size.
    
    Preprocess the image by performing a band pass and a threshold.
    Locate all peaks of brightness, characterize the neighborhoods of the peaks
    and take only those with given total brightnesss ("mass"). Finally,
    refine the positions of each peak.
    
    Parameters
    ----------
    image : image array (any dimensions)
    diameter : feature size in px
    minmass : minimum integrated brightness
        Default is 100, but a good value is often much higher. This is a
        crucial parameter for elminating spurrious features.
    maxsize : maximum radius-of-gyration of brightness, default None
    separation : feature separation, in pixels
        Default is the feature diameter + 1.
    noise_size : width of Gaussian blurring kernel, in pixels
        Default is 1.
    smoothing_size : size of boxcar smoothing, in pixels
        Default is the same is feature separation.
    threshold : Clip bandpass result below this value.
        Default 1; use 8 for 16-bit images.
    invert : Set to True if features are darker than background. False by
        default.
    percentile : Features must have a peak brighter than pixels in this
        percentile. This helps eliminate spurrious peaks.
    topn : Return only the N brightest features above minmass. 
        If None (default), return all features above minmass.
    preprocess : Set to False to turn out automatic preprocessing.
    
    Returns
    -------
    DataFrame([x, y, mass, size, ecc, signal])
        where mass means total integrated brightness of the blob,
        size means the radius of gyration of its Gaussian-like profile,
        and ecc is its eccentricity (1 is circular).
    
    See Also
    --------
    batch : performs location on many images in batch
    
    Notes
    -----
    This is an implementation of the Crocker-Grier centroid-finding algorithm.
    [1]_
    
    References
    ----------
    .. [1] Crocker, J.C., Grier, D.G. http://dx.doi.org/10.1006/jcis.1996.0217

All functions in mr are documented in this way.

Check for subpixel accuracy.

As Eric Weeks points out in the tutorial for his tutorial, a quick way to check for subpixel accuracy is to check that the decimal part of the x and/or y positions are evenly distributed. mr provides a convenience function for this:



In [36]:

    
mr.subpx_bias(f)









    












    Out[36]:





<matplotlib.axes.AxesSubplot at 0x120b3a650>

If we use a mask size that is too small, the histogram often shows a dip in the middle.



In [37]:

    
mr.subpx_bias(mr.locate(frames[5], 7))









    












    Out[37]:





<matplotlib.axes.AxesSubplot at 0x11f9d7350>

Find features in all frames.

You can select:

all frames, frames[:]
a range of frames, like frames[100:200]
all frames after a starting frames, like frames[100:]
a list of specific frames, like frames[[100, 107, 113]]

We'll locate features in the first 300 frames from this video. We use mr.batch, which calls mr.locate on each frame and collects the results.



In [16]:

    
f = mr.batch(frames[:300], 11, minmass=2000)









    



Frame 300: 455 features

As each frame is analyzed, mr.batch reports the frame number and how many features were found. If this number runs unexpectedly low or high, you may wish to interrupt it and try different parameters.

If you have OpenCV installed (optional) you can view the circled features as a video.



In [17]:

    
mr.circle(f, frames[:300])  # opens a separate window









    



Press Ctrl+C to interrupt video.

Link features into particle trajectories

We have the locations of the particles in each frame. Next we'll associate each particle with an ID number and track it from frame to frame.

First, we must must specify a maximum displacement, the farthest a particle is likely to travel between frames. We must not underestimate how far a particle might travel, but we should choose the smallest reasonable value because a large value slows computation time considerably. In this case, 5 pixels is reasonable.

Second, we allow for the possibility that a particle might be missed in one frame and then seen again in the next frame. (Perhaps its "mass" slipped below our cutoff due to noise in the video.) Memory keeps track of disappeared particles and maintains their ID for up to some number of frames after their last appearance. We'll choose 3.



In [18]:

    
t = mr.link(f, 5, memory=3)









    



Frame 300: 455 trajectories present

The result contains the information from the features f with an additional column, probe, identifying each feature with a label.



In [20]:

    
t.head()









    Out[20]:






  
    
      
      x
      y
      mass
      size
      ecc
      signal
      ep
      frame
      probe
    
  
  
    
      0
       233.285353
       13.437998
       3558
       2.414065
       0.120121
       19.458034
       0.148157
       0
       0
    
    
      1
       233.311850
       13.818745
       3638
       2.559517
       0.048906
       19.536512
       0.130360
       1
       0
    
    
      2
       233.787198
       14.607853
       3261
       2.413864
       0.097972
       19.515001
       0.151275
       2
       0
    
    
      3
       234.489771
       14.494402
       3466
       2.456723
       0.086164
       19.572660
       0.140946
       3
       0
    
    
      4
       234.400785
       14.219731
       3568
       2.563488
       0.098350
       19.575473
       0.131169
       4
       0
    
  

5 rows × 9 columns

Filter spurrious trajectories.

We have more filtering to do. Empheremeral trajectories -- seen only for a few frames -- are usually spurrious and never useful. The convenience function filter_stubs keeps only trajectories that last for a given number of frames.



In [23]:

    
t1 = mr.filter_stubs(t, 50)
# Compare the number of probes in the unfiltered and filtered data.
t['probe'].nunique(), t1['probe'].nunique()









    Out[23]:





(3879, 715)

We can also filter trajectories by their appearance. At this stage, with trajectories linked, we can look at a feature's "average appearance" throughout its trajectory, giving a more accurate picture.



In [38]:

    
mr.mass_size(t1.groupby('probe').mean())  # convenience function -- just plots size vs. mass









    












    Out[38]:





<matplotlib.axes.AxesSubplot at 0x121ea5c50>

The probes with especially low mass or especially large size are probably out of focus or aggregated, respectively. It is best to experiment by trial and error, filtering out regions of mass-size space and looking at the results using mr.annotate and mr.circle. In the end, we need to separate the good probes from the spurrious ones, and it doesn't matter how we get it done.



In [47]:

    
condition = lambda x: (x['mass'].mean() > 2800) & (x['size'].mean() < 3.0) & (x['ecc'].mean() < 0.1)
t2 = mr.filter(t1, condition)  # a wrapper for pandas' filter that works around a bug in v 0.12



In [48]:

    
mr.annotate(t2[t2['frame'] == 0], frames[0])









    












    Out[48]:





<matplotlib.axes.AxesSubplot at 0x12057fe90>

Trace the trajectories.



In [49]:

    
mr.plot_traj(t1)









    












    Out[49]:





<matplotlib.axes.AxesSubplot at 0x1243f8dd0>

Remove overall drift

Compute the overall drifting motion, which we will subtract away, adopting the reference frame of the probes' average position.



In [50]:

    
d = mr.compute_drift(t1)



In [51]:

    
d.plot()









    Out[51]:





<matplotlib.axes.AxesSubplot at 0x122847e90>



In [52]:

    
tm = mr.subtract_drift(t1, d)

With the overall drifting motion subtracted out, we plot the trajectories again. We observe nice random walks.



In [53]:

    
mr.plot_traj(tm)









    












    Out[53]:





<matplotlib.axes.AxesSubplot at 0x11f019450>

Analyze trajectories

Mean Squared Displacement of Individal Probes

Compute the mean squared displacement of each particle and plot MSD vs. lag time.



In [54]:

    
im = mr.imsd(tm, 100/285., 24) # microns per pixel = 100/285., frames per second = 24



In [62]:

    
im.plot(loglog=True, style='k-', alpha=0.1, legend=False) # black lines, semitransparent, no legend
plt.gca().set_ylabel(r'$\langle \Delta r^2 \rangle$ [$\mu$m$^2$]');

Since we only analyzed 300 frames, the statistics are poor at large lag times. With more frames, we can study larger lag times.

Ensemble Mean Squared Displacement



In [57]:

    
em = mr.emsd(tm, 100/285., 24)



In [58]:

    
em.plot(loglog=True, style='o')









    Out[58]:





<matplotlib.axes.AxesSubplot at 0x122c12d50>

We can easily fit this to a power law using a convenience function, fit_powerlaw, that performs a linear regression in log space.



In [63]:

    
em.plot(loglog=True, style='ro')
mr.utils.fit_powerlaw(em)









    



hi






    












    Out[63]:






  
    
      
      n
      A
    
  
  
    
      msd
       1.046943
       1.681362
    
  

1 rows × 2 columns

In water, a viscous material, the expected power-law exponent $n = 1$. The value of $A = 4D$, where $D$ is the particles' diffusivity. $D$ is related to viscosity $\eta$, particle radius $a$, and temperature $T$ as

$D = \displaystyle\frac{kT}{6\pi\eta a}$.

For paritcles with a 1 $\mu\text{m}$ diameter in room-temperature water, $A \approx 1.74$. Our value above is not far off.

	x	y	mass	size	ecc	signal	ep
0	69.227507	5.805180	3755	2.846401	0.066815	21.609637	0.095450
1	205.860481	6.449213	4414	2.908863	0.149883	24.609637	0.080542
2	336.196881	6.378277	2264	2.133259	0.082158	24.609637	0.146473
3	36.281895	8.087197	4262	2.920783	0.099597	22.609637	0.087016
4	277.321270	7.803879	2729	2.808470	0.160056	14.609637	0.144733

	x	y	mass	size	ecc	signal	ep	frame
0	233.285353	13.437998	3558	2.414065	0.120121	19.458034	0.148157	0
1	233.311850	13.818745	3638	2.559517	0.048906	19.536512	0.130360	1
2	233.787198	14.607853	3261	2.413864	0.097972	19.515001	0.151275	2
3	234.489771	14.494402	3466	2.456723	0.086164	19.572660	0.140946	3
4	234.400785	14.219731	3568	2.563488	0.098350	19.575473	0.131169	4